301 research outputs found
GNN-encoder: Learning a Dual-encoder Architecture via Graph Neural Networks for Passage Retrieval
Recently, retrieval models based on dense representations have come to dominate
passage retrieval tasks, owing to their superior ability to capture the
semantics of input text compared with traditional sparse vector-space models.
A common practice of dense retrieval models is to exploit a dual-encoder
architecture to represent a query and a passage independently. Though
efficient, such a structure loses the interaction between the query-passage
pair, resulting in inferior accuracy. To enhance the performance of dense retrieval
models without loss of efficiency, we propose a GNN-encoder model in which
query (passage) information is fused into passage (query) representations via
graph neural networks that are constructed by queries and their top retrieved
passages. In this way, we maintain a dual-encoder structure while retaining some
interaction information between query-passage pairs in their representations,
which enables us to achieve both efficiency and efficacy in passage retrieval.
Evaluation results indicate that our method significantly outperforms existing
models on the MSMARCO, Natural Questions and TriviaQA datasets, achieving new
state-of-the-art results on these datasets. Comment: 11 pages, 6 figures
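The dual-encoder setup described in the abstract embeds the query and each passage independently and compares them by inner product. A minimal sketch of that scoring scheme is below; the bag-of-words `encode` is a toy stand-in for the learned encoder towers, and `VOCAB` and the example texts are illustrative assumptions (the paper's GNN-based fusion is not modeled here).

```python
import numpy as np

# Shared toy vocabulary (assumption: real dual encoders use learned
# BERT-style towers, not a bag-of-words).
VOCAB = ["paris", "is", "the", "capital", "of", "france",
         "moon", "orbits", "earth"]

def encode(text: str) -> np.ndarray:
    """Toy stand-in for a learned encoder: a token-count vector over VOCAB."""
    vec = np.zeros(len(VOCAB))
    for tok in text.lower().split():
        if tok in VOCAB:
            vec[VOCAB.index(tok)] += 1.0
    return vec

query = "capital of france"
passages = [
    "paris is the capital of france",
    "the moon orbits the earth",
]

# Dual-encoder scoring: query and passages are embedded independently,
# then compared by inner product -- no cross-attention between the pair.
q_vec = encode(query)
scores = [float(q_vec @ encode(p)) for p in passages]
best = int(np.argmax(scores))  # index of the top-ranked passage
```

Because the two towers never see each other's input, passage vectors can be precomputed and indexed offline, which is the efficiency the abstract refers to; the lost query-passage interaction is what the GNN-encoder aims to restore.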
To what extent can control policies influence the epidemic spreading? -- A data-driven analysis based on the first wave of COVID-19
On May 5th, 2023, the WHO declared an end to the global COVID-19 public health
emergency, marking a significant transition from global critical emergency
response activities to long-term, sustained COVID-19 prevention and control.
Against this backdrop, we present a comprehensive review of the control
policies adopted by 127 countries/territories during the first wave of the
COVID-19 pandemic up to July 2nd, 2020, and quantitatively evaluate their
impacts on the epidemic dynamics through both linear and nonlinear regressions.
Our analyses reveal the intrinsic correlations between the strength of control
policies and the dynamical characteristics of COVID-19 epidemics, both for each
country/territory under consideration and from a global perspective. Our
results may help to design more economical and more effective preventive
measures in the long-term fight against COVID-19. Comment: 17 pages, 5 figures, 2 tables
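The abstract's quantitative evaluation pairs linear and nonlinear regressions of policy strength against epidemic dynamics. A minimal sketch of that kind of analysis follows, on made-up illustrative numbers; the actual per-country policy indices and fitted epidemic quantities are assumptions here, not the paper's data.

```python
import numpy as np

# Illustrative stand-ins (assumption: the paper uses per-country
# policy-strength indices and observed epidemic characteristics).
policy_strength = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
growth_rate     = np.array([0.30, 0.24, 0.17, 0.12, 0.06])

# Linear regression: growth_rate ~ a * strength + b
a, b = np.polyfit(policy_strength, growth_rate, 1)

# One simple nonlinear alternative: a quadratic fit and its residuals
coeffs = np.polyfit(policy_strength, growth_rate, 2)
resid = growth_rate - np.polyval(coeffs, policy_strength)

# Correlation between policy strength and epidemic growth
r = np.corrcoef(policy_strength, growth_rate)[0, 1]
```

A strongly negative slope and correlation coefficient would indicate that stricter control policies coincide with slower epidemic growth, which is the kind of intrinsic correlation the abstract describes.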
PreQuant: A Task-agnostic Quantization Approach for Pre-trained Language Models
While transformer-based pre-trained language models (PLMs) have dominated a
number of NLP applications, these models are cumbersome to deploy and expensive
to use. Effectively compressing large-scale PLMs has therefore become an
increasingly important problem. Quantization, which represents high-precision
tensors in a low-bit fixed-point format, is a viable solution. However, most
existing quantization methods are task-specific, requiring customized training
and quantization with a large number of trainable parameters on each individual
task. Inspired by the observation that the over-parameterized nature of PLMs
makes it possible to freeze most of the parameters during the fine-tuning
stage, in this work we propose a novel ``quantize before fine-tuning''
framework, PreQuant, which differs from both quantization-aware training and
post-training quantization. PreQuant is compatible with various quantization
strategies, with outlier-aware parameter-efficient fine-tuning incorporated to
correct the induced quantization error. We demonstrate the effectiveness of
PreQuant on the GLUE benchmark using BERT, RoBERTa, and T5. We also provide an
empirical investigation into the workflow of PreQuant, which sheds light on its
efficacy. Comment: Findings of ACL202
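Quantization as described here maps high-precision tensors to a low-bit fixed-point format. The sketch below shows generic symmetric uniform 8-bit quantization of a weight tensor, as one simple instance of such a mapping; it is an assumption for illustration, not PreQuant's specific quantization strategy or its outlier-aware fine-tuning.

```python
import numpy as np

def quantize(w: np.ndarray, bits: int = 8):
    """Symmetric uniform quantization: map float weights to low-bit
    integers using a single scale factor per tensor."""
    qmax = 2 ** (bits - 1) - 1                  # 127 for 8 bits
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the integer codes."""
    return q.astype(np.float32) * scale

w = np.array([-0.5, -0.1, 0.0, 0.2, 0.5], dtype=np.float32)
q, scale = quantize(w)
w_hat = dequantize(q, scale)

# Per-element quantization error, bounded by half a quantization step
err = float(np.abs(w - w_hat).max())
```

The gap between `w` and `w_hat` is the induced quantization error that, in PreQuant's framework, the subsequent parameter-efficient fine-tuning stage is meant to correct.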